Abstract In this lecture we clarify the basic difference between the
correlation properties of systems characterized by small or large
fluctuations. The concepts of correlation length, homogeneity scale,
scale invariance and criticality are discussed as well. We relate these
concepts to the interpretation of galaxy clustering.
1. INTRODUCTION

The existence of large scale structures (LSS) and voids in the
distribution of galaxies up to several hundred Megaparsecs has been well
known for twenty years [1, 2]. The relationship of these structures with
the statistics of the galaxy distribution is usually inferred by applying
the standard statistical analysis introduced and developed by Peebles
and coworkers [3]. Such an analysis implicitly assumes that the
distribution is homogeneous at very small scales
($\lambda_0 \approx 5 \div 10\, h^{-1}$ Mpc). The system is therefore
characterized as having small fluctuations about a finite average
density. If the galaxy distribution had a fractal nature the situation
would be completely different. In this case the average density in finite
samples is not a well defined quantity: it is strongly sample-dependent
and goes to zero in the limit of an infinite volume. In such a situation
it is not meaningful to study fluctuations around the average density
extracted from sample data. The statistical properties of the
distribution should then be studied in a framework completely different
from the standard one. We have been working on this problem for some
time [4], following the original ideas of Pietronero [5]. The result is
that galaxy structures are indeed fractal up to tens of Megaparsecs [6].
Whether or not a crossover to homogeneity occurs at a certain scale
$\lambda_0$ (corresponding to the absence of voids of typical scale
larger than $\lambda_0$) is still a matter of debate [7]. At present, the
basic problem is that the available redshift surveys do not sample scales
larger than $50 \div 100\, h^{-1}$ Mpc in a wide portion of the sky and
in a complete way.

In this lecture we try to clarify some simple and basic concepts, such
as the proper definition of correlation length, homogeneity scale,
average density and scale invariance. We point out that a correct
definition and interpretation of these concepts is necessary in order to
understand phenomenologically the statistical properties of galaxy
structures and to define the correct theoretical questions one would
like to answer.



2. DISTRIBUTION WITH SMALL FLUCTUATIONS

Consider a statistically homogeneous and isotropic particle density
$n(\vec{r})$, with or without correlations, with a well defined average
value $n_0$. Let

$n(\vec{r}) = \sum_i \delta(\vec{r} - \vec{r}_i)$    (1)

be the number density of points in the system (the index $i$ runs over
all the points), and let us suppose that the system is infinite.
Statistical homogeneity and isotropy mean that any $n$-point statistical
property of the system is a function only of the scalar relative
distances between these $n$ points. The existence of a well defined
average density means that

$\lim_{R \to \infty} \frac{1}{\|C(R)\|} \int_{C(R)} n(\vec{r})\, d^3r = n_0 > 0$    (2)

(where $\|C(R)\| = 4\pi R^3/3$ is the volume of the sphere $C(R)$)
independently of the origin of coordinates. The scale $\lambda_0$ such
that the one-point average density is well defined, i.e.

$\left| \frac{1}{\|C(R)\|} \int_{C(R)} n(\vec{r})\, d^3r - n_0 \right| < n_0 \quad \text{for } R > \lambda_0$ ,    (3)

is called the homogeneity scale. If $n(\vec{r})$ is extracted from a
density ensemble, $n_0$ is considered to be the same for each
realization, i.e. it is a self-averaging quantity.

Let $\langle F \rangle$ be the ensemble average of a quantity $F$ related
to $n(\vec{r})$. If only one realization of $n(\vec{r})$ is available,
$\langle F \rangle$ can be evaluated as an average over all the different
points (occupied or not) of the space taken as origin of the coordinates.
The quantity

$\langle n(\vec{r}_1) n(\vec{r}_2) \ldots n(\vec{r}_l) \rangle \, dV_1 dV_2 \ldots dV_l$

gives the average probability of finding $l$ particles placed in the
infinitesimal volumes $dV_1, dV_2, \ldots, dV_l$ respectively around
$\vec{r}_1, \vec{r}_2, \ldots, \vec{r}_l$. For this reason
$\langle n(\vec{r}_1) n(\vec{r}_2) \ldots n(\vec{r}_l) \rangle$ is called
the complete $l$-point correlation function. Obviously
$\langle n(\vec{r}) \rangle = n_0$, and in a single sample of volume $V$
such that $V^{1/3} \gg \lambda_0$ it can be estimated by

$n_V = N/V$    (4)

where $N$ is the total number of particles in the volume $V$.

Let us analyze the auto-correlation properties of such a system. Due to
the hypothesis of statistical homogeneity and isotropy,
$\langle n(\vec{r}_1) n(\vec{r}_2) \rangle$ depends only on
$r_{12} = |\vec{r}_1 - \vec{r}_2|$. Moreover,
$\langle n(\vec{r}_1) n(\vec{r}_2) n(\vec{r}_3) \rangle$ is only a
function of $r_{12} = |\vec{r}_1 - \vec{r}_2|$,
$r_{23} = |\vec{r}_2 - \vec{r}_3|$ and $r_{13} = |\vec{r}_1 - \vec{r}_3|$.
The reduced two-point and three-point correlation functions $\xi(r)$ and
$\zeta(r_{12}, r_{23}, r_{13})$ are respectively defined by:

$\langle n(\vec{r}_1) n(\vec{r}_2) \rangle = n_0^2 \, [1 + \xi(r_{12})]$    (5)
$\langle n(\vec{r}_1) n(\vec{r}_2) n(\vec{r}_3) \rangle = n_0^3 \, [1 + \xi(r_{12}) + \xi(r_{23}) + \xi(r_{13}) + \zeta(r_{12}, r_{23}, r_{13})]$ .

The reduced two-point correlation function $\xi(r)$ defined above is a
useful tool to describe the correlation properties of small fluctuations
with respect to the average. However, we stress again that in order to
perform a statistical analysis following Eqs. 5, the one-point average
density must be a well defined quantity, and this must be carefully
tested in any given sample (see below). We define

$\sigma^2(R) \equiv \frac{\langle N(R)^2 \rangle - \langle N(R) \rangle^2}{\langle N(R) \rangle^2}$    (6)

to be the mean square fluctuation normalized to the average density,
where $N(R)$ is the number of points in the sphere $C(R)$. From the
definition of $\lambda_0$, we have

$\sigma^2(\lambda_0) \simeq 1$    (7)

and $\sigma^2(R) \ll 1$ for $R > \lambda_0$. Note that $\sigma^2(R)$ is
again related to the one-point property of the distribution. We stress
that the definition of the homogeneity scale via Eq. 7 can be misleading
when the average density is not a well defined concept (see next
section). Indeed, in such a case the quantity $\langle N(R) \rangle$ in
the denominator is not given by
$\langle N(R) \rangle = n_0 \times \|C(R)\| \sim R^3$: the scaling
exponent is different from the Euclidean dimension of the space, $d = 3$.
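
The normalized variance of Eq. 6 can be measured directly by counting
points in spheres. The following is a minimal sketch, assuming an
uncorrelated (Poisson) sample in a unit cube; the function names, the
sample size and the radii are illustrative choices, not taken from the
analysis of real catalogs.

```python
import random


def counts_in_spheres(points, radius, n_centres, rng, box=1.0):
    """Count the points inside spheres of a given radius.

    Centres are drawn uniformly (occupied or not) and kept at least
    `radius` away from the faces of the cubic box, so that every sphere
    lies fully inside the sample.
    """
    r2 = radius * radius
    counts = []
    for _ in range(n_centres):
        c = [rng.uniform(radius, box - radius) for _ in range(3)]
        inside = sum(
            1 for p in points
            if (p[0] - c[0]) ** 2 + (p[1] - c[1]) ** 2 + (p[2] - c[2]) ** 2 <= r2
        )
        counts.append(inside)
    return counts


def normalized_variance(counts):
    """sigma^2 = (<N^2> - <N>^2) / <N>^2, the quantity defined in Eq. 6."""
    mean = sum(counts) / len(counts)
    mean_sq = sum(c * c for c in counts) / len(counts)
    return (mean_sq - mean * mean) / (mean * mean)


# Illustrative Poisson sample: 2000 uncorrelated points in a unit cube.
rng = random.Random(0)
points = [(rng.random(), rng.random(), rng.random()) for _ in range(2000)]

for radius in (0.05, 0.1, 0.2):
    counts = counts_in_spheres(points, radius, 200, rng)
    mean = sum(counts) / len(counts)
    # For uncorrelated points one expects sigma^2 to be close to 1/<N(R)>.
    print(radius, mean, normalized_variance(counts), 1.0 / mean)
```

In such a sample the normalized variance is expected to decrease as the
spheres grow. This is the behaviour that cannot be taken for granted when
the average density is not well defined.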

In order to analyze observations from an occupied point it is necessary
to define another kind of average: the conditional average
$\langle F \rangle_p$, which characterizes the two-point properties of
the system. This is defined as an ensemble average with the condition
that the origin of coordinates is an occupied point. When only one
realization of $n(\vec{r})$ is available, $\langle F \rangle_p$ can be
evaluated by averaging the quantity $F$ over all the occupied points
taken as origin of coordinates. The quantity

$\langle n(\vec{r}_1) n(\vec{r}_2) \ldots n(\vec{r}_l) \rangle_p \, dV_1 dV_2 \ldots dV_l$    (8)

is the average probability of finding $l$ particles placed in the
infinitesimal volumes $dV_1, dV_2, \ldots, dV_l$ respectively around
$\vec{r}_1, \vec{r}_2, \ldots, \vec{r}_l$, with the condition that the
origin of coordinates is an occupied point. We call
$\langle n(\vec{r}_1) n(\vec{r}_2) \ldots n(\vec{r}_l) \rangle_p$ the
conditional $l$-point density. Applying the rules of conditional
probability [8] one has:

$\Gamma(r) \equiv \langle n(\vec{r}) \rangle_p = \frac{\langle n(\vec{0}) n(\vec{r}) \rangle}{n_0}$    (9)
$\langle n(\vec{r}_1) n(\vec{r}_2) \rangle_p = \frac{\langle n(\vec{0}) n(\vec{r}_1) n(\vec{r}_2) \rangle}{n_0}$ ,

where $\Gamma(r)$ is called the conditional average density [5].

However, in general, the following convention is assumed in the
definition of the conditional densities: the particle at the origin does
not observe itself. Therefore $\langle n(\vec{r}) \rangle_p$ is defined
only for $r > 0$, and $\langle n(\vec{r}_1) n(\vec{r}_2) \rangle_p$ only
for $r_1, r_2 > 0$. In the following we use this convention, as it
corresponds to the experimental data in galaxy catalogs.

We have defined above the homogeneity scale by means of the one-point
properties of the distribution. Here we may define it in another way, by
looking at the two-point properties. If the presence of an object at the
point $\vec{r}_1$ influences the probability of finding another object
at $\vec{r}_2$, these two points are correlated. Hence there is a
correlation at the distance $r$ if

$G(r) = \langle n(\vec{0}) n(\vec{r}) \rangle \neq \langle n \rangle^2$ .    (10)

On the other hand, there is no correlation if

$G(r) \approx \langle n \rangle^2 > 0$ .    (11)

Therefore the proper definition of $\lambda_0$, the homogeneity scale, is
the length scale beyond which $G(r)$, or equivalently $\Gamma(r)$,
becomes nearly constant with scale and shows a well defined flattening.
If this scale is smaller than the sample size, then one may study, for
instance, the behaviour of $\sigma^2(R)$ with scale (Eq. 6) in the
sample.

The length scale $\lambda_0$ represents the typical dimension of the
voids in the system. On the other hand, there is another length scale
which is very important for the characterization of spatial point
distributions: the correlation length $r_c$. The length $r_c$ separates
scales at which density fluctuations are correlated (i.e.
probabilistically related) from scales at which they are uncorrelated.
It can be defined only if the system shows a crossover towards
homogeneity, i.e. if $\lambda_0$ exists [9]. In other words, $r_c$
characterizes the organization in geometrical structures of the
fluctuations with respect to the average density. Clearly
$r_c > \lambda_0$: only if the average density can be defined may one
study the correlation length of the fluctuations around it. Note that
$r_c$ is not related to the absolute amplitude of the fluctuations, but
to their probabilistic correlation. When $\lambda_0$ is finite, and
hence $\langle n \rangle > 0$, the correlation properties of the
fluctuations around the average, and hence the behaviour of $r_c$, can
be studied through the reduced two-point correlation function $\xi(r)$
defined in Eq. 5.

The correlation length can be defined through the scaling behavior of
$\xi(r)$ with scale. There are many definitions of $r_c$, but in any
case, in order to have $r_c$ finite, $\xi(r)$ must decay to zero fast
enough with scale. For instance, if

$|\xi(r)| \sim \exp(-r/r_c)$ for $r \to \infty$ ,    (12)

this means that for $r \gg r_c$ the system is structureless and density
fluctuations are weakly correlated. The definition of the correlation
length $r_c$ by Eq. 12 is equivalent to the one given in [9].

2.1. SOME EXAMPLES 

Let us consider some simple examples. The first one is a Poisson
distribution, for which there are no correlations between different
points. In such a situation [10] the average density is well defined, and

$\xi(r_{12}) = \delta(\vec{r}_1 - \vec{r}_2)/n_0$ .    (13)

Analogously, one can obtain the three-point correlation function:

$\zeta(r_{12}, r_{23}, r_{13}) = \delta(\vec{r}_1 - \vec{r}_2)\, \delta(\vec{r}_2 - \vec{r}_3)/n_0^2$ .    (14)

The two previous relations say only that there is no correlation between
different points. That is, the reduced correlation functions $\xi$ and
$\zeta$ have only the so-called "diagonal" part. This diagonal part is
present in the reduced correlation functions of any statistically
homogeneous and isotropic distribution with correlations. For instance
[11], $\xi(r)$ can in general be written as
$\xi(r) = \delta(\vec{r})/n_0 + h(r)$, where $h(r)$ is the non-diagonal
part, which is meaningful only for $r > 0$. Consequently, we obtain for
the purely Poisson case (remember that conditional densities are defined
only for points away from the origin):

$\langle n(\vec{r}) \rangle_p = n_0$    (15)
$\langle n(\vec{r}_1) n(\vec{r}_2) \rangle_p = n_0^2 \, [1 + \delta(\vec{r}_1 - \vec{r}_2)/n_0]$ .

The second example is a distribution which is homogeneous but with a
finite correlation length $r_c$. In such a situation $\Gamma(r)$ has a
well defined flattening and one may study the properties of $\xi(r)$.
The correlation length $r_c$ is usually defined as the scale beyond which
$\xi(r)$ is exponentially damped. It measures up to which distance
density fluctuations are correlated. Note that while $\lambda_0$ refers
to a one-point property of the system (the average density), $r_c$ refers
to a two-point property (the density-density correlation) [9, 12]. In
such a situation $\xi(r)$ is in general represented by

$\xi(r) = A \exp(-r/r_c)$    (16)

where $A$ is a prefactor which basically depends on the homogeneity
scale $\lambda_0$. We recall that $\lambda_0$ gives the scale beyond
which $\sigma^2(R) \ll 1$ (Eq. 6), and not the scale beyond which density
fluctuations are no longer correlated. This means that the typical
dimension of voids in the system is not larger than $\lambda_0$, but one
may find structures of density fluctuations of size up to
$r_c > \lambda_0$.

Finally, let us consider a mixed case in which the system is homogeneous
(i.e. $\lambda_0$ is finite) but has long-range power-law correlations.
This means that fluctuations around the average, independently of their
amplitude, are correlated at all scales, i.e. one finds structures of
all scales. However, we stress again that these are structures of
fluctuations with respect to a mean which is well defined. This last
situation is in general described by the divergence of the correlation
length $r_c$. Therefore let us consider a system in which
$\langle n(\vec{r}) \rangle = n_0 > 0$ and
$\xi(r) = \delta(\vec{r})/n_0 + h(r)$, with

$|h(r)| \sim r^{-\gamma}$ for $r > \lambda_0$ ,    (17)

and $0 < \gamma < 3$. For $\gamma > 3$, despite the power-law behavior,
$\xi(r)$ is integrable for large $r$ and, depending on the statistical
quantity of the point distribution being studied, we can consider the
system either as having a finite $r_c$ (i.e. as behaving like an
exponentially damped $\xi(r)$) or not. Eq. 17 characterizes the presence
of scale-invariant structures of fluctuations with long-range
correlations, a situation which in Statistical Physics is also called
"critical" [13].

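The role of the value $\gamma = 3$ can be made explicit with a short
calculation. As a sketch, assume a pure power law $|h(r)| = c\, r^{-\gamma}$
beyond $\lambda_0$ (the constant $c$ is introduced here only for
illustration) and integrate it over a sphere of radius $R$ in three
dimensions:

```latex
\int_{\lambda_0}^{R} c\, r^{-\gamma}\, 4\pi r^2 \, dr
  = \frac{4\pi c}{3-\gamma}\left( R^{3-\gamma} - \lambda_0^{3-\gamma} \right),
  \qquad \gamma \neq 3 .
```

For $\gamma < 3$ the integral grows without bound as $R \to \infty$,
while for $\gamma > 3$ it converges to
$4\pi c\, \lambda_0^{3-\gamma}/(\gamma - 3)$.
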
3. DISTRIBUTION WITH LARGE FLUCTUATIONS

The fractal case is completely different from that of a homogeneous point
distribution, with or without correlations. For a fractal distribution
the average density $\langle n \rangle$ in the infinite system is zero,
so that $G(r) = 0$ and $\lambda_0 = \infty$, and consequently $\xi(r)$ is
not defined. For a fractal point distribution with dimension $D < 3$ the
conditional one-point density $\langle n(\vec{r}) \rangle_p$ (hereafter
called $\Gamma(r)$) has the following behavior [4]

$\langle n(\vec{r}) \rangle_p \equiv \Gamma(r) = B r^{D-3}$ ,    (18)

for large enough $r$. The interpretation of this behavior is the
following. We may compute the average mass-length relation from an
occupied point, i.e. the average number of points in a spherical volume
of radius $R$ centered on an occupied point:

$\langle N(R) \rangle_p = \frac{4\pi B}{D} R^D$ .    (19)

The constant $B$ is directly related to the lower cut-off of the
distribution: it sets the mean number of galaxies in a sphere of radius
$1\, h^{-1}$ Mpc. Eq. 19 implies that the average density in a sphere of
radius $R$ around an occupied point scales as $1/R^{3-D}$. Hence it
depends on the sample size $R$, the fractal is asymptotically empty and
thus $\lambda_0 \to \infty$. We have two limiting cases for the fractal
dimension: (1) $D = 0$ means that there is a finite number of points
well localized far from the boundary of the sample; (2) for $D = 3$ the
distribution has a well defined positive average density, i.e. the
conditional average density no longer depends on scale. Given the metric
interpretation of the fractal dimension, it is simple to show that
$0 \le D \le 3$. Obviously, in the case $D = 3$, for which $\lambda_0$ is
finite, $\Gamma(r)$ provides the same information as $G(r)$, i.e. it
characterizes the crossover to homogeneity.
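
The step from Eq. 18 to Eq. 19 is an integration of the conditional
density over a sphere. A sketch, assuming for simplicity that the power
law of Eq. 18 holds down to $r = 0$ (in practice it holds only above the
lower cut-off) and that $D > 0$:

```latex
\langle N(R) \rangle_p = \int_0^R \Gamma(r)\, 4\pi r^2\, dr
  = 4\pi B \int_0^R r^{D-1}\, dr = \frac{4\pi B}{D} R^D ,
\qquad
\frac{\langle N(R) \rangle_p}{\|C(R)\|}
  = \frac{4\pi B R^D / D}{4\pi R^3 / 3} = \frac{3B}{D} R^{D-3} .
```

The second expression is the average density within a sphere of radius
$R$ around an occupied point, which decreases as $1/R^{3-D}$ for $D < 3$.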

A very important point is what kind of information about the correlation
properties of the infinite system can be extracted from the analysis of
a finite sample of it. In [5] it is demonstrated that, under the
hypothesis of statistical homogeneity and isotropy, even in the
super-correlated case of a fractal the estimate of $\Gamma(r)$ extracted
from a finite sample of size $R_s$ does not depend on the sample size
$R_s$, and provides a good approximation to that of the whole system.
Clearly this is true apart from statistical fluctuations [10] due to the
finiteness of the sample. In general the $\Gamma(r)$ extracted from a
sample can be written in the following way:

$\Gamma(r) = \frac{1}{N} \sum_{i=1}^{N} \frac{1}{4\pi r^2 \Delta r} \int_{r}^{r+\Delta r} n(\vec{r}_i + \vec{r}\,')\, d^3r'$ ,    (20)

where $N$ is the number of points in the sample,
$n(\vec{r}_i + \vec{r}\,')\, d^3r'$ is the number of points in the volume
element $d^3r'$ around the point $\vec{r}_i + \vec{r}\,'$, and $\Delta r$
is the thickness of the shell at distance $r$ from the point at
$\vec{r}_i$. Note that a sample of a homogeneous point distribution of
size $V^{1/3} \ll \lambda_0$ must be studied in the same framework as the
fractal case.
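
An estimator of the form of Eq. 20 can be sketched in a few lines. The
code below is only an illustration under stated assumptions: the sample
is a cubic box, only points farther than the largest radius from every
face are used as centres (so that each shell lies fully inside the
sample), and the exact shell volume is used in place of
$4\pi r^2 \Delta r$. The function name and the test sample are
hypothetical.

```python
import math
import random


def conditional_density(points, r_edges, box=1.0):
    """Estimate Gamma(r) in the shells [r_edges[k], r_edges[k+1])."""
    r_max = r_edges[-1]
    # Assumed boundary treatment: keep only centres whose largest shell
    # is fully contained in the box.
    centres = [p for p in points if all(r_max <= x <= box - r_max for x in p)]
    if not centres:
        raise ValueError("no point is far enough from the sample boundary")
    pair_counts = [0] * (len(r_edges) - 1)
    for c in centres:
        for p in points:
            if p is c:  # the particle at the origin does not observe itself
                continue
            d = math.dist(c, p)
            for k in range(len(pair_counts)):
                if r_edges[k] <= d < r_edges[k + 1]:
                    pair_counts[k] += 1
                    break
    gamma = []
    for k, count in enumerate(pair_counts):
        shell_volume = 4.0 / 3.0 * math.pi * (r_edges[k + 1] ** 3 - r_edges[k] ** 3)
        gamma.append(count / (len(centres) * shell_volume))
    return gamma


# Illustrative check on an uncorrelated sample of mean density n0.
rng = random.Random(1)
n0 = 1000  # points in a unit cube
poisson = [(rng.random(), rng.random(), rng.random()) for _ in range(n0)]
gamma_poisson = conditional_density(poisson, [0.02, 0.05, 0.1, 0.15])
print(gamma_poisson)  # each entry should scatter around n0
```

For an uncorrelated sample the estimate should be flat and close to the
mean density, whereas for a fractal sample it is expected to decay as a
power law, following Eq. 18.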

4. PROBLEMS OF THE STANDARD ANALYSIS

In the fractal case ($V^{1/3} \ll \lambda_0$), the sample estimate of
the homogeneity scale through the value of $r$ at which the
sample-dependent correlation function $\xi(r)$ (given by Eq. 21) is
equal to 1 is meaningless. This estimate is the so-called "correlation
length" $r_0$ [3] of the standard approach in cosmology. As we discuss
below, $r_0$ has nothing to do with the true correlation length $r_c$.
Let us see why $r_0$ is unphysical in the case $V^{1/3} \ll \lambda_0$.
The length $r_0$ [3] is defined by the relation $\xi(r_0) = 1$, where
$\xi(r)$ is given operationally by

$\xi(r) = \frac{\Gamma(r)}{n_V} - 1$ ,    (21)

where $n_V$ is given by Eq. 4. What does $r_0$ mean in this case? The
basic point of the present discussion [5] is that the mean density of
the sample, $n_V$, used in the normalization of $\xi(r)$, is not an
intrinsic quantity of the system, but a function of the finite size
$R_s$ of the sample.

Indeed, from Eq. 18, the expression for the $\xi(r)$ of the sample in
the case of fractal distributions is [5]

$\xi(r) = \frac{D}{3} \left( \frac{r}{R_s} \right)^{D-3} - 1$ ,    (22)

$R_s$ being the radius of the assumed spherical sample of volume $V$.
This follows from Eq. 21 on estimating the sample density through
Eq. 19 as $n_V = (3B/D) R_s^{D-3}$. From Eq. 22 it follows that $r_0$
(defined by $\xi(r_0) = 1$) is a linear function of the sample size
$R_s$,

$r_0 = \left( \frac{D}{6} \right)^{\frac{1}{3-D}} R_s$ ,    (23)

and hence it is a spurious quantity without physical meaning, simply
related to the finite size of the sample. In other words, this is due to
the fact that in the fractal case $n_V$ is, in any finite sample, a bad
estimate of the asymptotic density, which is zero in this case.

We note that the amplitude of $\Gamma(r)$ (Eq. 18) is related to the
lower cut-off of the fractal, while the amplitude of $\xi(r)$ is related
to the upper cut-off (the sample size $R_s$) of the distribution. This
crucial difference has never been appreciated appropriately.
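
The dependence of $r_0$ on the sample size can be checked numerically.
The sketch below simply evaluates Eqs. 22 and 23; the value $D = 2$ and
the sample radii are illustrative assumptions, and the function names are
hypothetical.

```python
def xi_fractal(r, D, R_s):
    """Sample correlation function of a fractal of dimension D (Eq. 22)."""
    return (D / 3.0) * (r / R_s) ** (D - 3.0) - 1.0


def r0_fractal(D, R_s):
    """Scale at which xi equals 1 in a spherical sample of radius R_s (Eq. 23)."""
    return (D / 6.0) ** (1.0 / (3.0 - D)) * R_s


D = 2.0  # illustrative fractal dimension
for R_s in (20.0, 40.0, 80.0):
    r0 = r0_fractal(D, R_s)
    # r0 doubles whenever R_s doubles, while xi(r0) stays equal to 1.
    print(R_s, r0, xi_fractal(r0, D, R_s))
```

With $D = 2$ one finds $r_0 = R_s/3$: the "correlation length" measures
the size of the sample, not a property of the distribution.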

Finally, we stress that in the standard analysis of galaxy catalogs the
fractal dimension is estimated by fitting $\xi(r)$ with a power law,
whereas, as one can see from Eq. 22, $\xi(r)$ is a power law only for
$r \ll r_0$ (or $\xi \gg 1$). For distances around and beyond $r_0$
there is a clear deviation from the power-law behavior, due to the
definition of $\xi(r)$. Again, this deviation is due to the finite size
of the observational sample and does not correspond to any real change
in the correlation properties. It is easy to see that if one estimates
the exponent at distances $r \lesssim r_0$, one systematically obtains a
higher value of the correlation exponent, due to the break of $\xi(r)$
in a log-log plot.
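
The size of this bias can be estimated from Eq. 22. Differentiating it
gives the local log-log slope
$d \ln \xi / d \ln r = -(3 - D)(1 + 1/\xi)$, which approaches the true
exponent $-(3 - D)$ only for $\xi \gg 1$ and is twice as steep at
$r = r_0$. The sketch below checks this numerically; $D = 2$ and
$R_s = 100$ are illustrative assumptions and the function names are
hypothetical.

```python
import math


def xi_fractal(r, D, R_s):
    """Sample correlation function of a fractal of dimension D (Eq. 22)."""
    return (D / 3.0) * (r / R_s) ** (D - 3.0) - 1.0


def local_exponent(r, D, R_s, eps=1e-4):
    """Minus the log-log slope of xi at r, by central differences.

    Only meaningful where xi(r) > 0, i.e. well inside the sample.
    """
    lo, hi = r * (1.0 - eps), r * (1.0 + eps)
    slope = (math.log(xi_fractal(hi, D, R_s)) - math.log(xi_fractal(lo, D, R_s))) / (
        math.log(hi) - math.log(lo)
    )
    return -slope


D, R_s = 2.0, 100.0
r0 = (D / 6.0) ** (1.0 / (3.0 - D)) * R_s
for r in (1.0, 10.0, r0):
    # The true exponent is 3 - D = 1; the apparent one grows towards r0.
    print(r, local_exponent(r, D, R_s))
```

A power-law fit that includes scales close to $r_0$ therefore returns an
exponent larger than $3 - D$, and hence an underestimated fractal
dimension.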

5. DISCUSSION AND CONCLUSION 

From an operational point of view, given a finite sample of points (e.g.
a galaxy catalog), the first analysis to be done is the determination of
the $\Gamma(r)$ of the sample itself. Such a measurement is necessary to
distinguish between two cases: (1) a crossover towards homogeneity
within the sample, with a flattening of $\Gamma(r)$ and hence an estimate
of $\lambda_0 < R_s$ and of $\langle n \rangle$; (2) a continuation of
the fractal behavior. Obviously only in case (1) is it physically
meaningful to introduce an estimate of the correlation function $\xi(r)$
(Eq. 21) and to extract from it the length scale $r_0$
($\xi(r_0) = 1$) to estimate the intrinsic homogeneity scale
$\lambda_0$. In this case, the functional behavior of $\xi(r)$ with
distance gives instead information
on the correlation length of the density fluctuations. Note that there
are always subtle finite size effects which perturb the behaviour of
$\xi(r)$ for $r$ comparable to the sample size and which must be
properly taken into account. These same arguments apply to the
estimation of the power spectrum of the density fluctuations, which is
just the Fourier conjugate of the correlation function [4]. The
application of these concepts to real galaxy data can be found in
[4, 6, 10].